In this exercise, we will explore aspects of local or site-specific diversity, also known as alpha (\(\alpha\)) diversity. First we will quantify two of the fundamental components of (\(\alpha\)) diversity: richness and evenness. From there, we will then discuss ways to integrate richness and evenness, which will include univariate metrics of diversity along with an investigation of the species abundance distribution (SAD).
AlphaDiversity_Worskheet.Rmd and the PDF output of Knitr (AlphaDiversity_Worskheet.pdf).In the R code chunk below, please provide the code to: 1) Clear your R environment, 2) Print your current working directory, 3) Set your working directory to your 5.AlphaDiversity folder, and 4) Load the vegan R package (be sure to install first if you haven’t already).
rm(list=ls())
getwd()
## [1] "C:/Users/dusti/GitHub/QB2019_Brewer/2.Worksheets/5.AlphaDiversity"
setwd("C:/Users/dusti/GitHub/QB2019_Brewer/2.Worksheets/5.AlphaDiversity")
install.packages("vegan", repos = "http://cran.us.r-project.org")
## Installing package into 'C:/Users/dusti/Documents/R/win-library/3.4'
## (as 'lib' is unspecified)
## package 'vegan' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\dusti\AppData\Local\Temp\RtmpKcJPGB\downloaded_packages
library("vegan")
## Warning: package 'vegan' was built under R version 3.4.4
## Loading required package: permute
## Warning: package 'permute' was built under R version 3.4.4
## Loading required package: lattice
## This is vegan 2.5-3
In the R code chunk below, do the following: 1) Load the BCI dataset, and 2) Display the structure of the dataset (if the structure is long, use the max.level = 0 argument to show the basic information).
data(BCI)
BCI
str(BCI)
## 'data.frame': 50 obs. of 225 variables:
## $ Abarema.macradenia : int 0 0 0 0 0 0 0 0 0 1 ...
## $ Vachellia.melanoceras : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Acalypha.diversifolia : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Acalypha.macrostachya : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Adelia.triloba : int 0 0 0 3 1 0 0 0 5 0 ...
## $ Aegiphila.panamensis : int 0 0 0 0 1 0 1 0 0 1 ...
## $ Alchornea.costaricensis : int 2 1 2 18 3 2 0 2 2 2 ...
## $ Alchornea.latifolia : int 0 0 0 0 0 1 0 0 0 0 ...
## $ Alibertia.edulis : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Allophylus.psilospermus : int 0 0 0 0 1 0 0 0 0 0 ...
## $ Alseis.blackiana : int 25 26 18 23 16 14 18 14 16 14 ...
## $ Amaioua.corymbosa : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Anacardium.excelsum : int 0 0 0 0 0 0 0 1 0 0 ...
## $ Andira.inermis : int 0 0 0 0 1 1 0 0 1 0 ...
## $ Annona.spraguei : int 1 0 1 0 0 0 0 1 1 0 ...
## $ Apeiba.glabra : int 13 12 6 3 4 10 5 4 5 5 ...
## $ Apeiba.tibourbou : int 2 0 1 1 0 0 0 1 0 0 ...
## $ Aspidosperma.desmanthum : int 0 0 0 1 1 1 0 0 0 1 ...
## $ Astrocaryum.standleyanum : int 0 2 1 5 6 2 2 0 2 1 ...
## $ Astronium.graveolens : int 6 0 1 3 0 1 2 2 0 0 ...
## $ Attalea.butyracea : int 0 1 0 0 0 1 1 0 0 0 ...
## $ Banara.guianensis : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Beilschmiedia.pendula : int 4 5 7 5 8 6 5 9 11 14 ...
## $ Brosimum.alicastrum : int 5 2 4 3 2 2 6 4 3 6 ...
## $ Brosimum.guianense : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Calophyllum.longifolium : int 0 2 0 2 1 2 2 2 2 0 ...
## $ Casearia.aculeata : int 0 0 0 0 0 0 0 1 0 0 ...
## $ Casearia.arborea : int 1 1 3 2 4 1 2 3 9 7 ...
## $ Casearia.commersoniana : int 0 0 1 0 1 0 0 0 1 0 ...
## $ Casearia.guianensis : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Casearia.sylvestris : int 2 1 0 0 0 3 1 0 1 1 ...
## $ Cassipourea.guianensis : int 2 0 1 1 3 4 4 0 2 1 ...
## $ Cavanillesia.platanifolia : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Cecropia.insignis : int 12 5 7 17 21 4 0 7 2 16 ...
## $ Cecropia.obtusifolia : int 0 0 0 0 1 0 0 2 0 2 ...
## $ Cedrela.odorata : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Ceiba.pentandra : int 0 1 1 0 1 0 0 1 0 1 ...
## $ Celtis.schippii : int 0 0 0 2 2 0 1 0 0 0 ...
## $ Cespedesia.spathulata : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Chamguava.schippii : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Chimarrhis.parviflora : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Maclura.tinctoria : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Chrysochlamys.eclipes : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Chrysophyllum.argenteum : int 4 1 2 2 6 2 3 2 4 2 ...
## $ Chrysophyllum.cainito : int 0 0 0 0 0 0 1 0 0 0 ...
## $ Coccoloba.coronata : int 0 0 0 1 2 0 0 1 2 1 ...
## $ Coccoloba.manzinellensis : int 0 0 0 0 0 0 0 2 0 0 ...
## $ Colubrina.glandulosa : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Cordia.alliodora : int 2 3 3 7 1 1 2 0 0 2 ...
## $ Cordia.bicolor : int 12 14 35 23 13 7 5 10 7 13 ...
## $ Cordia.lasiocalyx : int 8 6 6 11 7 6 6 3 0 4 ...
## $ Coussarea.curvigemma : int 0 0 0 1 0 2 1 0 1 1 ...
## $ Croton.billbergianus : int 2 2 0 11 6 0 0 4 2 0 ...
## $ Cupania.cinerea : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Cupania.latifolia : int 0 0 0 1 0 0 0 0 0 0 ...
## $ Cupania.rufescens : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Cupania.seemannii : int 2 2 1 0 3 0 1 2 2 0 ...
## $ Dendropanax.arboreus : int 0 3 6 0 5 2 1 6 1 3 ...
## $ Desmopsis.panamensis : int 0 0 4 0 0 0 0 0 0 1 ...
## $ Diospyros.artanthifolia : int 1 1 1 1 0 0 0 0 0 1 ...
## $ Dipteryx.oleifera : int 1 1 3 0 0 0 0 2 1 2 ...
## $ Drypetes.standleyi : int 2 1 2 0 0 0 0 0 0 0 ...
## $ Elaeis.oleifera : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Enterolobium.schomburgkii : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Erythrina.costaricensis : int 0 0 0 0 0 3 0 0 1 0 ...
## $ Erythroxylum.macrophyllum : int 0 1 0 0 0 0 0 1 1 1 ...
## $ Eugenia.florida : int 0 1 0 7 2 0 0 1 1 3 ...
## $ Eugenia.galalonensis : int 0 0 0 0 0 0 0 1 0 0 ...
## $ Eugenia.nesiotica : int 0 0 1 0 0 0 5 4 3 0 ...
## $ Eugenia.oerstediana : int 3 2 5 1 5 2 2 3 3 3 ...
## $ Faramea.occidentalis : int 14 36 39 39 22 16 38 41 33 42 ...
## $ Ficus.colubrinae : int 0 1 0 0 0 0 0 0 0 0 ...
## $ Ficus.costaricana : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Ficus.insipida : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Ficus.maxima : int 1 0 0 0 0 0 0 0 0 0 ...
## $ Ficus.obtusifolia : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Ficus.popenoei : int 0 0 0 0 0 0 1 0 0 0 ...
## $ Ficus.tonduzii : int 0 0 1 2 1 0 0 0 0 0 ...
## $ Ficus.trigonata : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Ficus.yoponensis : int 1 0 0 0 0 1 1 0 0 0 ...
## $ Garcinia.intermedia : int 0 1 1 3 2 1 2 2 1 0 ...
## $ Garcinia.madruno : int 4 0 0 0 1 0 0 0 0 1 ...
## $ Genipa.americana : int 0 0 1 0 0 0 1 0 1 1 ...
## $ Guapira.myrtiflora : int 3 1 0 1 1 7 3 1 1 1 ...
## $ Guarea.fuzzy : int 1 1 0 1 3 0 0 2 0 3 ...
## $ Guarea.grandifolia : int 0 0 0 0 0 0 0 1 0 0 ...
## $ Guarea.guidonia : int 2 6 2 5 3 4 4 0 1 5 ...
## $ Guatteria.dumetorum : int 6 16 6 3 9 7 8 6 2 2 ...
## $ Guazuma.ulmifolia : int 0 0 0 1 0 0 0 0 0 0 ...
## $ Guettarda.foliacea : int 1 5 1 2 1 0 0 4 1 3 ...
## $ Gustavia.superba : int 10 5 0 1 3 1 8 4 4 4 ...
## $ Hampea.appendiculata : int 0 0 1 0 0 0 0 0 2 1 ...
## $ Hasseltia.floribunda : int 5 9 4 11 9 2 7 6 3 4 ...
## $ Heisteria.acuminata : int 0 0 0 0 1 1 0 0 0 0 ...
## $ Heisteria.concinna : int 4 5 4 6 4 8 2 5 1 5 ...
## $ Hirtella.americana : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Hirtella.triandra : int 21 14 5 4 6 6 7 14 8 7 ...
## $ Hura.crepitans : int 0 0 0 0 0 2 1 1 0 0 ...
## $ Hieronyma.alchorneoides : int 0 2 0 0 0 0 0 0 1 0 ...
## [list output truncated]
## - attr(*, "original.names")= chr "Abarema.macradenium" "Acacia.melanoceras" "Acalypha.diversifolia" "Acalypha.macrostachya" ...
Species richness (S) refers to the number of species in a system or the number of species observed in a sample.
In the R code chunk below, do the following:
Write a function called S.obs to calculate observed richness
Use your function to determine the number of species in site1 of the BCI data set, and
Compare the output of your function to the output of the specnumber() function in vegan.
S.obs <- function(x = ""){
rowSums(x > 0) * 1
}
BCI <- BCI
BCI
site1 <- (BCI[1,])
site1
S.obs(site1)
## 1
## 93
specnumber(BCI[1,])
## 1
## 93
specnumber(BCI[4,])
## 4
## 94
Question 1: Does specnumber() from vegan return the same value for observed richness in site1 as our function S.obs? What is the species richness of the first four sites (i.e., rows) of the BCI matrix?
Answer 1: Yes, specnumber() and our function S.obs() return the same value for observed richness, 93. For the 1st four sites, the species richness is 94.
In the R code chunk below, do the following:
Write a function to calculate Good’s Coverage, and
Use that function to calculate coverage for all sites in the BCI matrix.
C <- function(x = ""){
1 - (rowSums(x == 1) / rowSums(x))
}
C(BCI[,])
## 1 2 3 4 5 6 7
## 0.9308036 0.9287356 0.9200864 0.9468504 0.9287129 0.9174757 0.9326923
## 8 9 10 11 12 13 14
## 0.9443155 0.9095355 0.9275362 0.9152120 0.9071038 0.9242054 0.9132420
## 15 16 17 18 19 20 21
## 0.9350649 0.9267735 0.8950131 0.9193084 0.8891455 0.9114219 0.8946078
## 22 23 24 25 26 27 28
## 0.9066986 0.8705882 0.9030612 0.9095023 0.9115479 0.9088729 0.9198966
## 29 30 31 32 33 34 35
## 0.8983516 0.9221053 0.9382423 0.9411765 0.9220183 0.9239374 0.9267887
## 36 37 38 39 40 41 42
## 0.9186047 0.9379310 0.9306488 0.9268868 0.9386503 0.8880597 0.9299517
## 43 44 45 46 47 48 49
## 0.9140049 0.9168704 0.9234234 0.9348837 0.8847059 0.9228916 0.9086651
## 50
## 0.9143519
Question 2: Answer the following questions about coverage:
site1 was represented by singletons?Answer 2a: 0 to 1
Answer 2b: My impression is that all of the observations would be singletons, and so coverage wouldn’t be good.
Answer 2c: Only about 7% were represented by singltons
Answer 2d: Overall, coverage seems quite good. The smallest proportion is 0.88, and most are 0.9 or above.
In the R code chunk below, do the following:
Load the microbial dataset (located in the 5.AlphaDiversity/data folder),
Transform and transpose the data as needed (see handout),
Create a new vector (soilbac1) by indexing the bacterial OTU abundances of any site in the dataset,
Calculate the observed richness at that particular site, and
Calculate coverage of that site
soilbac <- read.table("data/soilbac.txt", sep = "\t", header = TRUE, row.names = 1)
soilbac
soilbac.t <- as.data.frame(t(soilbac))
soilbac.t
soilbac1 <- soilbac.t[1,]
S.obs(soilbac1)
## T1_1
## 1074
C(soilbac1)
## T1_1
## 0.6479471
Question 3: Answer the following questions about the soil bacterial dataset.
soilbac1, i.e. N?soilbac1?site1) and the KBS sample (soilbac1)?Answer 3a: 13,310
Answer 3b: 1074
Answer 3c: Coverage was less for the KBS sample than the BCI sample.
In the R code chunk below, do the following:
Write a function to calculate Chao1,
Write a function to calculate Chao2,
Write a function to calculate ACE, and
Use these functions to estimate richness at site1 and soilbac1.
Chao1 <- function(x = ""){
S.obs(x) + (sum(x == 1)^2) / (2*sum(x ==2))
}
Chao2 <- function(site = "", SbyS = ""){
SbyS = as.data.frame(SbyS)
x = SbyS[site, ]
SbyS.pa <- (SbyS > 0) * 1
Q1 = sum(colSums(SbyS.pa) == 1)
Q2 = sum(colSums(SbyS.pa) == 2)
Chao2 = S.obs(x) + (Q1^2) / (2*Q2)
return(Chao2)
}
ACE <- function(x = "", thresh = 10){
x <- x[x>0]
S.abund <- length(which(x > thresh))
S.rare <- length(which(x <= thresh))
singlt <- length(which(x == 1))
N.rare <- sum(x[which(x <= thresh)])
C.ace <- 1 - (singlt / N.rare)
i <- c(1:thresh)
count <- function(i, y){
length(y[y == i])
}
a.1 <- sapply(i, count, x)
f.1 <- (i * (i - 1)) * a.1
G.ace <- (S.rare/C.ace)*(sum(f.1)/(N.rare*(N.rare-1)))
S.ace <- S.abund + (S.rare/C.ace) + (singlt/C.ace) * max(G.ace,0)
return(S.ace)
}
Chao1(site1)
## 1
## 119.6944
Chao1(soilbac1)
## T1_1
## 2628.514
ACE(site1)
## [1] 159.3404
ACE(soilbac1)
## [1] 4465.983
Chao2(1, BCI)
## 1
## 104.6053
Chao2(1, soilbac.t)
## T1_1
## 21055.39
Question 4: What is the difference between ACE and the Chao estimators? Do the estimators give consistent results? Which one would you choose to use and why?
Answer 4: Ace and Chao estimators differ in that the ACE estimator utilizes a threshold for rare species as less than 10, which more broadly defines a rare species (compared to singletons and doubletons of the chao estimators). All of the estimators produced similar estimates for the BCI dataset, though for the soilbac data set the Chao2 estimator produced a higher estimate for richness than the other estimators (>20,000 vs 5000). I’m guessing this is because the Chao2 considered multiple sites and there are many more bacteria species than tree species. I’m not sure what estimator I should use, but my impression is that it may depend on the situation and sample size (perhaps the threshold of 10 is inappropriate in some situations (use Chao1 or 2)? perhaps I want to measure richness at more than one site (use Chao2)?
In the R code chunk below, please do the following:
Calculate observed richness for all samples in soilbac,
Determine the size of the smallest sample,
Use the rarefy() function to rarefy each sample to this level,
Plot the rarefaction results, and
Add the 1:1 line and label.
soilbac.S <- S.obs(soilbac.t)
soilbac.S
## T1_1 T1_2 T1_3 T7_1 T7_2 T7_3 DF_1 DF_2 CF_1 CF_2 CF_3
## 1074 1302 1174 1416 1406 1143 1806 1151 924 1122 851
min.N <- min(rowSums(soilbac.t))
min.N
## [1] 2119
S.rarefy <- rarefy(x = soilbac.t, sample = min.N, se = TRUE)
rarecurve(x = soilbac.t, step = 20, col = "blue", cex = 0.6, las = 1)
abline(0, 1, col = 'red')
text(1500, 1500, "1:1", pos = 2, col = 'red')
Here, we consider how abundance varies among species, that is, species evenness.
One of the most common ways to visualize evenness is in a rank-abundance curve (sometime referred to as a rank-abundance distribution or Whittaker plot). An RAC can be constructed by ranking species from the most abundant to the least abundant without respect to species labels (and hence no worries about ‘ties’ in abundance).
In the R code chunk below, do the following:
Write a function to construct a RAC,
Be sure your function removes species that have zero abundances,
Order the vector (RAC) from greatest (most abundant) to least (least abundant), and
Return the ranked vector
RAC <- function(x = ""){
x = as.vector(x)
x.ab = x[x > 0]
x.ab.ranked = x.ab[order(x.ab, decreasing = TRUE)]
return(x.ab.ranked)
}
Now, let’s examine the RAC for site1 of the BCI data set.
In the R code chunk below, do the following:
Create a sequence of ranks and plot the RAC with natural-log-transformed abundances,
Label the x-axis “Rank in abundance” and the y-axis “log(abundance)”
plot.new()
rac <- RAC(x = site1)
ranks <- as.vector(seq(1, length(rac)))
opar <- par(no.readonly = TRUE)
par(mar = c(5.1, 5.1, 4.1, 2.1))
plot(ranks, log(rac), type = 'p', axes = F,
xlab = "Rank in abundance", ylab = "Abundance",
las = 1, cex.lab = 1.4, cex.axis = 1.25)
box()
axis(side = 1, labels = T, cex.axis = 1.25)
axis(side = 2, las = 1, cex.axis = 1.225,
labels = c(1, 2, 5, 10, 20), at = log(c(1,2,5,10,20)))
Question 5: What effect does visualizing species abundance data on a log-scaled axis have on how we interpret evenness in the RAC?
Answer 5:The larger values don’t seem as large as they really are compared to the smaller values, based on distance between ticks on y axis. This allows for a greater range of values to be displayed in a smaller area, but can be mis-leading if one does not pay attention to the scale (in which case abundance #s might seem more similar than they really are).
Now that we have visualized unevennes, it is time to quantify it using Simpson’s evenness (\(E_{1/D}\)) and Smith and Wilson’s evenness index (\(E_{var}\)).
In the R code chunk below, do the following:
Write the function to calculate \(E_{1/D}\), and
Calculate \(E_{1/D}\) for site1.
SimpE <- function(x = ""){
S <- S.obs(x)
x = as.data.frame(x)
D <- diversity(x, "inv")
E <- (D)/S
return(E)
}
SimpE(site1)
## 1
## 0.4238232
In the R code chunk below, please do the following:
Write the function to calculate \(E_{var}\),
Calculate \(E_{var}\) for site1, and
Compare \(E_{1/D}\) and \(E_{var}\).
Evar <- function(x){
x <- as.vector(x[x > 0])
1 - (2/pi)*atan(var(log(x)))
}
Evar(site1)
## [1] 0.5067211
Question 6: Compare estimates of evenness for site1 of BCI using \(E_{1/D}\) and \(E_{var}\). Do they agree? If so, why? If not, why? What can you infer from the results.
Answer 6: The results from Simpson’s Index compared to Smith and Wilson’s Index are similar (0.42 vs 0.51, respectively). They are slightly different because the Simpson’s Index gave more weight to abundant species, which I assume resulted in less evenness. Smith and Wilson’s Index provided a less biased estimate of eveness.
So far, we have introduced two primary aspects of diversity, i.e., richness and evenness. Here, we will use popular indices to estimate diversity, which explicitly incorporate richness and evenness We will write our own diversity functions and compare them against the functions in vegan.
In the R code chunk below, please do the following:
Provide the code for calculating H’ (Shannon’s diversity),
Compare this estimate with the output of vegan’s diversity function using method = “shannon”.
ShanH <- function(x = ""){
H = 0
for (n_i in x) {
if(n_i > 0) {
p = n_i / sum(x)
H = H - p*log(p)
}
}
return(H)
}
ShanH(site1)
## [1] 4.018412
diversity(site1, index = "shannon")
## [1] 4.018412
In the R code chunk below, please do the following:
Provide the code for calculating D (Simpson’s diversity),
Calculate both the inverse (1/D) and 1 - D,
Compare this estimate with the output of vegan's diversity function using method = “simp”.
SimpD <- function(x = "") {
D = 0
N = sum(x)
for (n_i in x){
D = D + (n_i^2)/(N^2)
}
return(D)
}
D.inv <- 1/SimpD(site1)
D.sub <- 1-SimpD(site1)
D.inv
## [1] 39.41555
D.sub
## [1] 0.9746293
diversity(site1, "inv")
## [1] 39.41555
diversity(site1, "simp")
## [1] 0.9746293
Question 7: Compare estimates of evenness for site1 of BCI using \(E_{H'}\) and \(E_{var}\). Do they agree? If so, why? If not, why? What can you infer from the results.
Answer 7: Smith and Wilson’s Evenness Index (Evar) output an evenness of 0.5 for site 1 of BCI, and the Shannon Index (H’) output a value of 4.0. H’ is measuring diversity by incorporating evenness and richness, while Evar is measuring evenness. Both #s seem to suggest that there is a moderate amount of diversity at the site, but don’t seem to be directly comparable because they measure different things.
In the R code chunk below, please do the following:
Provide the code for calculating Fisher’s \(\boldsymbol\alpha\),
Calculate Fisher’s \(\boldsymbol\alpha\) for site1 of BCI.
rac <- as.vector(site1[site1 > 0])
invD <- diversity(rac, "inv")
invD
## [1] 39.41555
Fisher <- fisher.alpha(rac)
Fisher
## [1] 35.67297
Question 8: How is Fisher’s \(\boldsymbol\alpha\) different from \(E_{H'}\) and \(E_{var}\)? What does Fisher’s \(\boldsymbol\alpha\) take into account that \(E_{H'}\) and \(E_{var}\) do not?
Answer 8: Fisher’s metric is different than Evar and H’ because using Fisher’s metric, diversity is estimated rather than just creating a diversity metric. Fisher’s metric accounts for sampling error.
The diversity metrics that we just learned about attempt to integrate richness and evenness into a single, univariate metric. Although useful, information is invariably lost in this process. If we go back to the rank-abundance curve, we can retrieve additional information – and in some cases – make inferences about the processes influencing the structure of an ecological system.
The RAC is a simple data structure that is both a vector of abundances. It is also a row in the site-by-species matrix (minus the zeros, i.e., absences).
Predicting the form of the RAC is the first test that any biodiversity theory must pass and there are no less than 20 models that have attempted to explain the uneven form of the RAC across ecological systems.
In the R code chunk below, please do the following:
Use the radfit() function in the vegan package to fit the predictions of various species abundance models to the RAC of site1 in BCI,
Display the results of the radfit() function, and
Plot the results of the radfit() function using the code provided in the handout.
RACresults <- radfit(site1)
RACresults
##
## RAD models, family poisson
## No. of species 93, total abundance 448
##
## par1 par2 par3 Deviance AIC BIC
## Null 39.5261 315.4362 315.4362
## Preemption 0.042797 21.8939 299.8041 302.3367
## Lognormal 1.0687 1.0186 25.1528 305.0629 310.1281
## Zipf 0.11033 -0.74705 61.0465 340.9567 346.0219
## Mandelbrot 100.52 -2.312 24.084 4.2271 286.1372 293.7350
plot.new()
plot(RACresults, las = 1, cex.lab = 1.4, cex.axis = 1.25)
Question 9: Answer the following questions about the rank abundance curves: a) Based on the output of radfit() and plotting above, discuss which model best fits our rank-abundance curve for site1? b) Can we make any inferences about the forces, processes, and/or mechanisms influencing the structure of our system, e.g., an ecological community?
Answer 9a: The Mandelbrot model best fits our data, given that the AIC and BIC values are lowest, and that the line on the plot best tracks the data points. Answer 9b: It seems that in the system being analyzed, and many others, there are conditions that allow for a small # of species to be more abundant and a larger # species to be less abundant, which may be due to the predomination of a particular habitat type and niche partitioning amongst available habitats.
Question 10: Answer the following questions about the preemption model: a. What does the preemption model assume about the relationship between total abundance (N) and total resources that can be preempted? b. Why does the niche preemption model look like a straight line in the RAD plot?
Answer 10a: The preemption model assumes that species that are less abundant have less resources available to them. When there are more species present, there are less resources available for others. Answer 10b: The model assumes a constant decay in abundance of species when going from most the abundant species to the least abundant species.
Question 11: Why is it important to account for the number of parameters a model uses when judging how well it explains a given set of data?
Answer 11: Overfitting results when a model with many parameters is able to describe very well what was sampled specifically, but becomes worse at describing the real world. Thus, a more simple model may be better representing the real world that one is interested in, rather than just the sampled data.
site 1 of the BCI site-by-species matrix.SimpDFin <- function(x = “”) { D = 0 N = sum(x) for (n_i in x){ D = D + (n_i(n_i -1))/ (N(N-2)) } return(D) }
D.inv2 <- 1/SimpDFin(site1) D.inv2
D.sub2 <- 1-SimpDFin(site1) D.sub2
D <- SimpDFin(site1) D
site 1 of the BCI site-by-species matrix, and describe the general pattern you see.hist(rac)
The histogram shows that the vast majority of species (60+) have abundances of less than 5, and the rest (about 20 total) have abundances of 5 or more.
Load that dataset.
Woolly <- read.csv(“data/hf08501bird.csv”, sep = “,”, header = TRUE)
Woolly
How many sites are there?
There are 19 sites.
How many species are there in the entire site-by-species matrix?
There are 49 species.
Any other interesting observations based on what you learned this week?
Each site is categorized as being affected by hemlock woolly adelgid to a varying degree, e.g. high, medium, and low. It will be interesting to find out how bird diversity has been affected by the hemlock woolly adelgid.
Use Knitr to create a PDF of your completed alpha_assignment.Rmd document, push it to GitHub, and create a pull request. Please make sure your updated repo include both the HTML and RMarkdown files.
Unless otherwise noted, this assignment is due on Wednesday, January 23rd, 2017 at 12:00 PM (noon).